The GAAIN Entity Mapper: An Active-Learning System for Medical Data Mapping
Abstract
This work addresses the mapping of biomedical datasets to a common representation, an integral part of data harmonization for integrated biomedical data access and sharing. We present GEM, an intelligent software assistant for automated data mapping across different datasets or from a dataset to a common data model. GEM automates data mapping by providing precise suggestions for data element mappings. It leverages the detailed element metadata found in dataset documentation, such as the data dictionaries that typically accompany biomedical datasets. It uses unsupervised text mining techniques to determine the similarity between data elements, and machine-learning classifiers to identify element matches. It also provides an active-learning capability that reduces the effort of training the system. Our experimental evaluations show that GEM provides highly accurate data mappings (over 90% accuracy) for real datasets of thousands of data elements each in the Alzheimer's disease research domain, and that the effort required to train the system for new datasets is reduced. We are currently using GEM to map Alzheimer's disease datasets from around the globe into a common representation, as part of a global Alzheimer's disease integrated data sharing and analysis network called GAAIN. GEM achieves significantly higher data mapping accuracy on biomedical datasets than other state-of-the-art database schema matching tools with similar functionality. With its active-learning capability, the user effort needed to train the system is minimal.
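As a rough illustration of the similarity step described in the abstract, the sketch below ranks candidate target elements for each source element by comparing their data-dictionary descriptions with TF-IDF weighting and cosine similarity. This is not GEM's actual algorithm, which the abstract does not specify. The choice of TF-IDF, the function names, and the example element names and descriptions are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a description and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (a dict) per tokenized document."""
    n = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    # Smoothed IDF: terms that occur in every document keep a small weight.
    idf = {t: math.log((1 + n) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    return [{t: c * idf[t] for t, c in Counter(doc).items()} for doc in docs]


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def suggest_mappings(source, target, top_k=2):
    """Rank target elements for each source element by description similarity."""
    src_names, tgt_names = list(source), list(target)
    docs = [tokenize(source[n]) for n in src_names]
    docs += [tokenize(target[n]) for n in tgt_names]
    vecs = tfidf_vectors(docs)
    src_vecs, tgt_vecs = vecs[:len(src_names)], vecs[len(src_names):]
    suggestions = {}
    for name, sv in zip(src_names, src_vecs):
        scored = sorted(
            ((cosine(sv, tv), t) for t, tv in zip(tgt_names, tgt_vecs)),
            reverse=True,
        )
        suggestions[name] = [(t, round(s, 3)) for s, t in scored[:top_k]]
    return suggestions


# Hypothetical data-dictionary entries (element name -> description).
source_dict = {
    "PTGENDER": "Sex of the participant",
    "MMSCORE": "Mini mental state examination total score",
}
target_dict = {
    "gender": "Participant sex male or female",
    "mmse_total": "Total score on the mini mental state examination",
    "age": "Age of participant at baseline visit in years",
}

suggestions = suggest_mappings(source_dict, target_dict)
for element, candidates in suggestions.items():
    print(element, "->", candidates)
```

In a system like the one the abstract describes, scores of this kind could serve as features for a match classifier, with an active-learning loop asking the user to label the candidate pairs the classifier is least sure about. That pipeline is likewise an assumption here, not a description of GEM's internals.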
Similar resources
GEM: The GAAIN Entity Mapper
We present a software system solution that significantly simplifies data sharing of medical data. This system, called GEM (for the GAAIN Entity Mapper), harmonizes medical data. Harmonization is the process of unifying information across multiple disparate datasets needed to share and aggregate medical data. Specifically, our system automates the task of finding corresponding elements across di...
MindLab-UNAL: Comparing Metamap and T-mapper for Medical Concept Extraction in SemEval 2014 Task 7
This paper describes our participation in task 7 of SemEval 2014, which focuses on analysis of clinical text. The task is divided into two parts: recognizing mentions of concepts that belong to the UMLS (Unified Medical Language System) semantic group disorders, and mapping each disorder to a unique UMLS CUI (Concept Unique Identifier), if possible. For identifying and mapping disorders belongi...
Named Entity Recognition in Persian Text using Deep Learning
Named entity recognition is a fundamental task in the field of natural language processing. It is also known as a subset of information extraction. The process of recognizing named entities aims at finding proper nouns in the text and classifying them into predetermined classes such as names of people, organizations, and places. In this paper, we propose a named entity recognizer which benefi...
A Novel Approach to Conditional Random Field-based Named Entity Recognition using Persian Specific Features
Named Entity Recognition is an information extraction technique that identifies named entities in a text. Three methods have conventionally been used to extract named entities from a text: rule-based, machine-learning-based, and hybrids of the two. Machine-learning-based methods perform well in the Persian language if they are trained with good features. To get good performanc...
Comparing the Effects of Concept Mapping and Integration Method on Nursing Students' Learning in Nursing Process Course in Tabriz University of Medical Sciences
Introduction: To analyze patients' problems and make an appropriate care plan, nursing students need deep and meaningful learning. Therefore, it is better to choose educational methods that are capable of educating nursing students to such a level of learning. The aim of this study was to compare the effect of concept mapping and integration model on nursing students' learning in nursing process ...